Fast PageRank Computation Via a Sparse Linear System (Extended Abstract)
Authors
Abstract
The research community has devoted increased attention to reducing the computation time needed by Web ranking algorithms. Many efforts have been devoted to improving PageRank [4, 23], the well-known ranking algorithm used by Google. The core of PageRank is an iterative assignment of rank weights to Web pages, repeated until a fixed point is reached. This fixed point turns out to be the (dominant) eigenpair of a matrix derived from the Web Graph itself. Brin and Page originally suggested computing this pair using the well-known Power method [12], and they also gave a nice interpretation of PageRank in terms of Markov Chains.
Recent studies about PageRank address at least two different needs. The first is to reduce the time spent weighting the nodes of the Web Graph, which takes several days. The second is to assign many PageRank values to each Web page, as a result of PageRank's personalization [14–16], which was recently presented by Google as a beta service (see http://labs.google.com/personalized/). The Web changes very rapidly: more than 25% of links are changed and 5% of "new content" is created in a week [8]. This result indicates that search engines need to update link-based ranking metrics (such as PageRank) very often, and that a week-old ranking may not reflect the current importance of the pages very well. This motivates the need to accelerate the PageRank computation.
Previous approaches followed different directions, such as attempting to compress the Web Graph to fit it into main memory [3] or implementing the algorithms in external memory [13, 7]. A very interesting research track exploits efficient numerical methods to reduce the computation time. These numerical techniques are the most promising, and the last few years have seen many intriguing results on accelerating the convergence of Power iterations [18, 13, 21]. The literature [1, 21, 23] also presents models that treat pages with no out-links in different ways.
In this paper we consider the original PageRank model (see Section 2) and, using numerical techniques, show that this problem can be transformed into an equivalent linear system of equations whose coefficient matrix is as sparse as the Web Graph itself. This new formulation of the
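As an illustration of the idea described in the abstract, the following minimal sketch compares the Power method with a sparse linear-system formulation on a toy graph. It is not taken from the paper: the graph, the damping factor `ALPHA = 0.85`, the uniform teleportation vector, and the function names are all assumptions made for the example. It also assumes that rank held by pages with no out-links is redistributed uniformly, in which case the PageRank vector is proportional to the solution of (I - alpha * P^T) y = v, where P keeps all-zero rows for those pages.

```python
# Minimal sketch: PageRank by power iteration vs. a sparse linear system.
# The toy graph, ALPHA, and all function names are illustrative assumptions.

ALPHA = 0.85  # damping factor (assumed; the abstract does not fix a value)

# Toy web graph: node -> list of out-links. Node 3 has no out-links (dangling).
GRAPH = {0: [1, 2], 1: [2], 2: [0, 3], 3: []}


def power_iteration(graph, alpha=ALPHA, tol=1e-12, max_iter=1000):
    """Iterate x <- alpha * P'^T x + (1 - alpha) * v with uniform v.

    Rank held by dangling nodes is redistributed uniformly, so the
    implicit matrix P' is dense even though the graph is sparse.
    """
    n = len(graph)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        dangling = sum(x[i] for i, outs in graph.items() if not outs)
        new = [(1.0 - alpha) / n + alpha * dangling / n] * n
        for i, outs in graph.items():
            for j in outs:
                new[j] += alpha * x[i] / len(outs)
        if sum(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


def linear_system(graph, alpha=ALPHA, tol=1e-12, max_iter=1000):
    """Solve (I - alpha * P^T) y = v, then normalize y to sum to 1.

    Uses the fixed-point iteration y <- v + alpha * P^T y, which coincides
    with Jacobi iteration when the graph has no self-loops. Each sweep
    touches only the edges of the graph.
    """
    n = len(graph)
    v = [1.0 / n] * n
    y = v[:]
    for _ in range(max_iter):
        new = v[:]
        for i, outs in graph.items():
            for j in outs:
                new[j] += alpha * y[i] / len(outs)
        converged = sum(abs(a - b) for a, b in zip(new, y)) < tol
        y = new
        if converged:
            break
    total = sum(y)
    return [value / total for value in y]


if __name__ == "__main__":
    print("power method :", power_iteration(GRAPH))
    print("linear system:", linear_system(GRAPH))
```

Under these assumptions the two functions return the same ranking up to the iteration tolerance. In practice the linear system could be handed to any sparse solver; the simple fixed-point sweep is used here only to keep the sketch dependency-free.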
Similar resources
Fast PageRank Computation via a Sparse Linear System
Recently, the research community has devoted increased attention to reducing the computational time needed by web ranking algorithms. In particular, many techniques have been proposed to speed up the well-known PageRank algorithm used by Google. This interest is motivated by two dominant factors: (1) the web graph has huge dimensions and is subject to dramatic updates in terms of nodes and link...
Euler-Richardson method preconditioned by weakly stochastic matrix algebras: a potential contribution to Pagerank computation
Let S be a column stochastic matrix with at least one full row. Then S describes a Pagerank-like random walk since the computation of the Perron vector x of S can be tackled by solving a suitable M-matrix linear system Mx = y, where M = I − τA, A is a column stochastic matrix and τ is a positive coefficient smaller than one. The Pagerank centrality index on graphs is a relevant example where th...
Fast ranking algorithm for very large data
In this paper, we propose a new ranking method inspired by previous results on the diffusion approach to solving linear equations. We describe new mathematical equations corresponding to this method and show through experimental results the potential computational gain. This ranking method is also compared to the well-known PageRank model. Keywords: Large sparse matrix, Iteration, Fixed point, Pa...
Approximation of Largest Eigenpairs of Matrices and Applications to Pagerank Computation
In this work, we propose different approaches for the treatment of the following problems: (i) computation of the largest eigenvalue of a matrix and the corresponding eigenvector when neither is known, (ii) computation of the eigenvector of a matrix corresponding to its largest eigenvalue when this eigenvalue is known. The matrix is arbitrary, large, and sparse. We treat the first problem by Kryl...
Approximating Personalized PageRank with Minimal Use of Web Graph Data
In this paper, we consider the problem of calculating fast and accurate approximations to the personalized PageRank score ([8, 16]) of a webpage. We focus on techniques to improve speed by limiting the amount of webgraph data we need to access. PageRank scores are mainly used for ranking purposes, and generally only the scores exceeding a given threshold are relevant. In practice, and relative ...